MetaLab is a research tool for aggregating across studies in the cognitive development literature. Currently, MetaLab contains 1804 effect sizes across meta-analyses in 22 domains of cognitive development, based on data from 477 papers collecting 28270 subjects. These studies can be used to obtain better estimates of effect sizes across different domains, methods, and ages. Using our power calculator, researchers can use these estimates to plan appropriate sample sizes for prospective studies. More generally, MetaLab can be used as a theoretical tool for exploring patterns in development across language acquisition domains. Learn more here.
Please note that data and visualizations are under development at the moment (Spring 2018) and should not be taken as definitive.
This page provides an overview of all datasets in MetaLab at this moment. All datasets have more detailed descriptions available through their home pages.
Curator: Molly Lewis
Curator: Francesco Margoni
Curator:
Curator: Hugh Rabagliati
Curator: Christina Bergmann
Curator: Julia Carbajal
Curator: Molly Lewis
Curator: Alex Cristia
Curator: Molly Lewis
Curator: Molly Lewis
Curator: Katie Von Holzen
Curator: Molly Lewis
Curator: Alex Cristia
Curator: Molly Lewis
Curator: Molly Lewis
Curator: Alex Cristia
Curator: Christina Bergmann
Curator: Sho Tsuji
Curator: Angeline Tsui
Curator: Sho Tsuji
Curator: Sho Tsuji
Curator: Christina Bergmann
Comments:
We welcome researchers interested in contributing to MetaLab. Please contact us.
Contributions can take various forms:
Check out the tutorial page on how to edit your meta-analysis in the MetaLab format.
All analyses on the site are conducted with the metafor package (Viechtbauer, 2010).
Effect size computation is handled by a script, compute_es.R.
Several pre-existing MAs deal with special cases, and these are listed in the script.
Except where noted, formulas are from Hedges & Olkin’s textbook.
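As an illustration of the kind of computation involved, here is a minimal sketch in R of a standardized mean difference for two independent groups, using the standard pooled-SD formulas of the kind given by Hedges & Olkin. The function name and its arguments (m1, sd1, n1, and so on) are hypothetical; this is not the content of compute_es.R, which also has to handle other designs and the special cases mentioned above.

```r
# Cohen's d for two independent groups, with its sampling variance.
# Hypothetical helper for illustration only; not taken from compute_es.R.
cohens_d_between <- function(m1, m2, sd1, sd2, n1, n2) {
  # Pooled standard deviation across the two groups
  sd_pooled <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
  d <- (m1 - m2) / sd_pooled
  # Large-sample approximation to the sampling variance of d
  d_var <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))
  list(d = d, d_var = d_var)
}

# Made-up summary statistics for two groups of 20 infants each
es <- cohens_d_between(m1 = 10, m2 = 8, sd1 = 2, sd2 = 2, n1 = 20, n2 = 20)
print(es)
```

The variance returned alongside d is what a meta-analytic model would use to weight each effect size.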
The visualizations page uses a multi-level random-effects meta-analysis (the rma.mv function of metafor). Random-effects models assume that the true effect can vary between studies, and therefore allow a random effect for each data point. The model also specifies random effects at the level of each paper, since studies within a paper can be assumed to be more similar to each other than to studies from different papers. In addition, we allow correlated random effects within each paper to account for cases where the same infants contributed data to multiple rows.
The meta-analytic models are accessible in the script server.R.
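A rough sketch of what such a model can look like with metafor is given below. It is not the code in server.R: the toy data are made up, the column names short_cite, unique_row and d_var_calc are assumptions chosen to resemble the field names used on this site, and the correlated within-paper random effects for shared infants are left out to keep the example short.

```r
# Toy data: two effect sizes from each of four (fictional) papers.
# Column names are assumptions for illustration.
toy <- data.frame(
  short_cite = rep(c("PaperA", "PaperB", "PaperC", "PaperD"), each = 2),
  unique_row = 1:8,
  d_calc     = c(0.30, 0.45, 0.10, 0.25, 0.60, 0.40, 0.05, 0.20),
  d_var_calc = c(0.04, 0.05, 0.03, 0.04, 0.06, 0.05, 0.03, 0.04),
  stringsAsFactors = FALSE
)

# Fit only if metafor is installed, so the sketch runs either way.
if (requireNamespace("metafor", quietly = TRUE)) {
  fit <- metafor::rma.mv(
    yi = d_calc,                             # observed effect sizes
    V = d_var_calc,                          # their sampling variances
    random = ~ 1 | short_cite / unique_row,  # rows nested within papers
    data = toy
  )
  print(summary(fit))
}
```

The nested term `short_cite / unique_row` gives one random effect per paper and one per row within a paper, which mirrors the two levels described above.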
This page gives the full specification for each field in the MetaLab dataset, including: required fields (which must be included for every MA), optional fields (which are only used for some MAs), and derived fields (which are computed by the site).
MetaLab promotes open science practices. All the datasets of the meta-analyses appearing on the website can be downloaded. If you want to work with one of them, these instructions might be useful:
You can download the data from both the home page and the visualization page.
From the home page
1.1. Click on the box corresponding to the domain you are interested in (e.g. cognitive development, early language).
1.2. Once you are on the page of the domain, click on the box corresponding to the meta-analysis you are interested in (e.g. word segmentation).
1.3. In the data tab, click on the “Download” button.
1.4. Choose the data format you want by clicking on it in the menu that appears: EXCEL (recommended for manual calculations in spreadsheet software; make sure your local spreadsheet software is set to use “.” as the decimal separator to avoid errors, see section 2.) or CSV (recommended for reading into statistical software such as R).
From the visualization page
2.1. Open the “Domain” drop-down menu and click on the one you are interested in.
2.2. Open the “Dataset” drop-down menu and click on the meta-analysis you are interested in.
2.3. Click on the “Download data” button above the “Domain” menu. You can also click on “View raw dataset” and follow steps 1.2. and 1.3.
The following instructions assume you are using Excel:
Open a blank workbook
Go to the data menu
Click on “Get external data”
Click on “From text”
A dialog box opens. In the drop-down menu on the right, select “UTF-8”.
Select “Delimited” and click “next”
Select “comma” as column delimiter and click “next”
Click on the “advanced” button and check that “.” is the decimal separator.
Click OK to finish.
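If you chose the CSV format and prefer R to a spreadsheet, the same import settings (UTF-8 encoding, comma as column delimiter, “.” as decimal separator) can be passed to read.csv. The sketch below writes a tiny stand-in file first so that it runs on its own; the file name and the values in it are hypothetical, and in practice you would point csv_path at the file you downloaded.

```r
# Create a small stand-in for a downloaded dataset (hypothetical content).
csv_path <- file.path(tempdir(), "metalab_example.csv")
writeLines(c(
  "short_cite,method,mean_age_1,n_1,d_calc",
  "PaperA,CF,182.5,16,0.33",
  "PaperB,HPP,243.4,24,0.12"
), csv_path)

# Read it with explicit encoding, delimiter and decimal separator.
dat <- read.csv(csv_path, sep = ",", dec = ".",
                fileEncoding = "UTF-8", stringsAsFactors = FALSE)

# Numeric columns should come out as numbers, not text.
str(dat)
```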
Your research question may be more specific than that of the meta-analysis. For example, you may be interested in a narrower age range than the meta-analysis covers, or in one specific method.
Click on “Data” > “Filter” (LibreOffice and OpenOffice have similar menus)
Scroll to the column coding your criterion (e.g. “mean_age_1”)
Click on the little triangle/filter symbol on the right corner of that column
Click on “Standard filter” or “Special filter” or “Custom filter”
In the next dialogue box, set your condition so that the column entitled “mean_age_1” is lower than the maximum age (in days) your prospective participants will have, and greater than your lower age bound. (We use the conversion: months × 30.42 days/month.) If you want to add a second condition, don’t press OK just yet!
To add a second condition, require that “method” is your chosen method (see here for codes). You SHOULD keep only studies with your chosen method, but other variables are left to your judgement. For instance, if I were running a study on 6-month-olds using Central Fixation, then my filters could be: method=“CF” AND 120 < mean_age_1 <= 365 (so babies between about 4 months and 1 year).
You can add as many conditions as you want (e.g., exclude some more studies because they use unusual stimuli, or a very different design from yours).
When you’re done entering inclusion conditions, you can click OK. The result will be a set of rows that contain all the effect sizes meeting your conditions.
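The same filtering can be sketched in R for readers who loaded the CSV there. The data frame below is made up, and the helper months_to_days is a hypothetical name; only the column names method and mean_age_1, the code “CF”, the age bounds and the 30.42 conversion factor come from the steps above.

```r
# Made-up rows standing in for a downloaded dataset.
dat <- data.frame(
  short_cite = c("A", "B", "C", "D", "E"),
  method     = c("CF", "CF", "HPP", "CF", "CF"),
  mean_age_1 = c(100, 182.5, 200, 365, 400),   # age in days
  d_calc     = c(0.20, 0.33, 0.50, 0.10, 0.40),
  stringsAsFactors = FALSE
)

# Conversion used on this page: months x 30.42 days/month.
months_to_days <- function(months) months * 30.42

# Inclusion conditions from the example: method = "CF" AND 120 < age <= 365.
lower <- 120
upper <- 365
selected <- subset(dat, method == "CF" & mean_age_1 > lower & mean_age_1 <= upper)
print(selected)
```

Further conditions can be added to the same expression with `&`, in the same way that extra conditions are added in the spreadsheet filter dialog.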
Two possibilities:
Compute the mean or the median of column “n_1” (or of both “n_1” and “n_2” if you are running a between-participants study). This shows how many infants per group were tested in previous studies, which you can use as a guide for your own sample size. We do not recommend this strategy.
1.1. In a new cell (typically below the last row of the “n_1” and “n_2” columns), type “=AVERAGE([coordinate (i.e. letter+number) of the first value]:[coordinate (i.e. letter+number) of the last value])” or “=MEDIAN([coordinate (i.e. letter+number) of the first value]:[coordinate (i.e. letter+number) of the last value])”. You can also type “=AVERAGE(“ and select the cells to average with the cursor. Make sure you’ve closed the bracket and press enter.
1.2. Note that if the variable “n_excluded_1” is coded in the meta-analysis you are looking at, you can also get an idea of the attrition rates found in this work. For instance, you can add a column, say called “n_total”, that sums n_1 and n_excluded_1. This will give you an idea of how many babies you should recruit, taking into account that some of them will need to be excluded (because they are fussy or for other reasons).
1.3. If n_excluded_1 is not present, you can use a 20% rule of thumb - i.e., recruit 20% more than your target sample size. You can get an idea of the attrition rate by method and age group on our design choices analyses page.
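For readers working in R rather than a spreadsheet, steps 1.1 to 1.3 can be sketched as follows. The numbers are invented, and target_n is a hypothetical planned sample size; n_1, n_excluded_1 and n_total are the column names used in the steps above.

```r
# Invented sample sizes and exclusions from four previous studies.
dat <- data.frame(
  n_1          = c(16, 24, 20, 12),
  n_excluded_1 = c(4, 6, 5, 3)
)

# Step 1.1: typical sample size in previous studies.
mean_n   <- mean(dat$n_1)
median_n <- median(dat$n_1)

# Step 1.2: infants recruited per study, and the overall attrition rate.
dat$n_total <- dat$n_1 + dat$n_excluded_1
attrition   <- sum(dat$n_excluded_1) / sum(dat$n_total)

# Step 1.3: 20% rule of thumb when n_excluded_1 is not available.
target_n <- 24                                   # hypothetical target
recruit_rule_of_thumb <- ceiling(target_n * 1.20)

print(c(mean_n = mean_n, median_n = median_n, attrition = attrition,
        recruit = recruit_rule_of_thumb))
```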
Previous studies might have chosen their sample size for practical rather than statistical reasons (e.g. the number of participants available within a predefined time range). Therefore they might be underpowered. To address this problem in your own study, you can do a prospective power analysis.
2.1. Copy the column entitled d_calc into a new sheet.
2.2. Calculate the median values for these columns – in my example, the median difference score is 0.91, the median pooled SD is 2.72, and the median effect size is 0.33.
2.3. Use your favorite power calculation system/online tool to estimate sample size. I used the following call in R (power.t.test, from the stats package that ships with R): power.t.test(delta = 0.91, sd = 2.72, power = 0.8, sig.level = 0.05, type = "paired", alternative = "two.sided") to find out I needed to test 72 infants for my example case.
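The calculation in step 2.3 can be run as a short script. The sketch below uses power.t.test from R's stats package with the example values quoted above (difference 0.91, pooled SD 2.72, 80% power, alpha of .05, paired design); the variable names are illustrative. The second call shows that passing the standardized effect size (difference divided by SD) with sd = 1 is an equivalent way to specify the same problem.

```r
# Power analysis for a paired design using the example medians from step 2.2.
res <- power.t.test(delta = 0.91, sd = 2.72, power = 0.8, sig.level = 0.05,
                    type = "paired", alternative = "two.sided")
print(res)

# Round up: you cannot test a fraction of an infant.
n_needed <- ceiling(res$n)

# Equivalent specification with a standardized effect size (d = delta / sd).
res_d <- power.t.test(delta = 0.91 / 2.72, sd = 1, power = 0.8, sig.level = 0.05,
                      type = "paired", alternative = "two.sided")
print(res_d$n)
```

For a paired design, the n reported is the number of pairs, i.e. the number of infants each contributing both measurements.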
If the number you get is too high, you might consider running sequential analyses. This means that you fully design your study in advance, including a predetermined number of participants to run and the statistical analyses you will run, and predefine the moments at which you will look at p-values before the end of data collection. At these predefined moments you decide whether to continue or stop data collection based on the outcome (significance reached, or fell below the threshold).
You can also define a stopping rule related to your data collection situation - e.g. “Data collection will stop when all conditions are balanced (N=72) or by Dec 31, 2018 (date when my internship ends), whichever happens first”. Notice that neither of these options takes power into consideration - they are not alternatives to power, they just help you avoid the questionable practice of stopping data collection when you have p<.05.
You must cite the publications linked to that dataset, as listed in the Documentation. (If there is no citation, then no citation for the individual dataset is necessary).
For tracking use of MetaLab, please also cite at least one of the following:
Bergmann, C., Tsuji, S., Piccinini, P. E., Lewis, M. L., Braginsky, M., Frank, M. C., & Cristia, A. (2018). Promoting replicability in developmental research through meta-analyses: Insights from language acquisition research. Child Development. [Repository: https://osf.io/uhv3d/]
Lewis, M. L., Braginsky, M., Tsuji, S., Bergmann, C., Piccinini, P. E., Cristia, A., & Frank, M. C. (2017/under review). A Quantitative Synthesis of Early Language Acquisition Using Meta-Analysis. DOI: 10.17605/OSF.IO/HTSJM [Preprint: https://psyarxiv.com/htsjm]
If you use more than five datasets, we ask as a courtesy that you cite all of them. In case of severe space limitations, however, the database as a whole may be cited on its own.
Since the MetaLab site is dynamic, we recommend that you list the date of download for your data in your manuscript. For example: “We analyze all data currently in MetaLab (Bergmann et al., 2018). Data were downloaded on 06/17/17.”
We are collecting frequent problems here. If you experience any issues that we do not cover when following the steps, please send an email to [metalab-project@googlegroups.com] and we will try to help you.
Make sure you selected UTF-8 as the encoding when you imported the data (see steps 2.1 to 2.5).
This is due to a faulty import; you need to repeat it:
Open a blank workbook
Go to the data menu
Click on “Get external data”
Click on “From text”
A dialog box opens. In the drop-down menu on the right, select “UTF-8”.
Select “Delimited” and click “next”
Select “comma” as column delimiter and click “next”
Click on the “advanced” button and check that “.” is the decimal separator.
Click OK to finish.
Check that . is the decimal separator:
Select the column where numbers are not formatted properly (if there are several, you need to treat them one by one).
Click on “data” > “Text to columns”. A dialog box opens
Select “Delimited” and click “next”
Select “comma” as column delimiter and click “next”
Click on the “advanced” button and check that “.” is the decimal separator. Click OK to finish.
Select column A (that contains all the data)
Click on “data” > “Text to columns”. A dialog box opens
Select “Delimited” and click “next”
Select “comma” as column delimiter and click “next”
Click on the “advanced” button and check that “.” is the decimal separator. Click OK to finish.